Searching a Russian Document Collection using English, Chinese and Japanese Queries
Abstract
As in CLEF 2003, Berkeley ran monolingual and bilingual experiments on the CLEF Russian Izvestia document collection. For CLEF 2004 we also experimented with Chinese and Japanese as topic languages, using English as the ‘pivot’ language. For bilingual retrieval our approaches were query translation (for English as a topic language) and ‘fast’ document translation from Russian to English (for Chinese and Japanese topics translated to English). Chinese and Japanese topic retrieval significantly under-performed English-to-Russian retrieval because of the loss of effectiveness from ‘double translation’.
Similar resources
Chinese and Korean Topic Search of Japanese News Collections
UC Berkeley participated in the pivot bilingual task of the CLIR track at NTCIR Workshop 4. Our focus was on Chinese and Korean searches against the Japanese News document collection, using English as a pivot language. For comparison of our pivot techniques, we submitted Japanese monolingual and English-to-Japanese bilingual search rankings as well. Two different commercial translation software pa...
How Similar are Chinese and Japanese for Cross-Language Information Retrieval?
For NTCIR Workshop 5 UC Berkeley participated in the bilingual task of the CLIR track. Our focus was on Chinese topic searches against the Japanese News document collection, and on Japanese topic searches against the Chinese News document collection. Extending our work from the NTCIR 4 workshop, we performed search experiments to segment and use Chinese search topics directly as if they were Japanese t...
Search Between Chinese and Japanese Text Collections
For NTCIR Workshop 6 UC Berkeley participated in Phase 1 of the bilingual task of the CLIR track. Our focus was on Japanese topic searches against the Chinese News document collection and on Chinese topic searches against the Japanese News document collection. We performed search experiments to segment and use Chinese search topics directly as if they were Japanese topics and vice versa. ...
RMIT and Gunma University at NTCIR-9 GeoTime Task
We participated in the English-English and Japanese-Japanese subtasks. We selected the Indri search engine as a baseline to test our new class of indexing algorithms. English documents for Indri: Each document was converted to lowercase and written in TREC SGML format. We then indexed the collection using Krovetz stemming and stopword removal. English documents for Newt: Each document was conve...
Working with Russian Queries for the GIRT, Bilingual and Multilingual CLEF Tasks
For our activities within the CLEF 2001 evaluation, Berkeley group one participated in the bilingual, multilingual and GIRT tasks focussing on the use of Russian queries. Performance on the Russian queries → English documents bilingual task was excellent, comparable to performance using German queries. For the multilingual task we utilized English as a pivot language between Russian and German a...